Page Layout Analysis System for Unconstrained Historic Documents

Authors

Abstract

Extraction of text regions and individual lines from historic documents is necessary for automatic transcription. We propose extending a CNN-based baseline detection system by adding line height and block boundary predictions to the model output, allowing it to extract more comprehensive layout information. We also show that pixel-wise orientation prediction can be used for processing documents with multiple text orientations. We demonstrate that the proposed method performs well on the cBAD dataset. Additionally, we benchmark the method on the newly introduced PERO dataset, which we make public.
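
To make the role of the line height prediction more concrete, the following is a minimal sketch of how a detected baseline plus two height values could be turned into a text line polygon. It is an illustration only and not the authors' post-processing: the function name `line_polygon` and the parameters `ascender` and `descender` are hypothetical, and the sketch assumes image coordinates (y grows downward) and a single pair of heights per line.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def line_polygon(baseline: List[Point], ascender: float, descender: float) -> List[Point]:
    """Build a polygon around a baseline polyline (hypothetical helper).

    Assumes image coordinates (y grows downward) and one height pair
    for the whole line; both are illustrative simplifications.
    """
    if len(baseline) < 2:
        raise ValueError("baseline needs at least two points")
    upper, lower = [], []
    last = len(baseline) - 1
    for i, (x, y) in enumerate(baseline):
        # Estimate the local reading direction from the neighbouring points.
        x0, y0 = baseline[max(i - 1, 0)]
        x1, y1 = baseline[min(i + 1, last)]
        dx, dy = x1 - x0, y1 - y0
        norm = math.hypot(dx, dy)
        if norm == 0:
            raise ValueError("baseline has repeated points")
        # Unit normal pointing towards the top of the letters.
        nx, ny = dy / norm, -dx / norm
        upper.append((x + nx * ascender, y + ny * ascender))
        lower.append((x - nx * descender, y - ny * descender))
    # Upper edge in reading order, then the lower edge reversed,
    # so the points trace a closed outline.
    return upper + lower[::-1]


if __name__ == "__main__":
    print(line_polygon([(0, 10), (10, 10)], ascender=4, descender=2))
```

Because the normal is derived from the baseline direction, the same helper works for rotated or vertical lines, which is where a per-line orientation estimate would matter.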

Similar resources

Page Layout Classification Technique for Biomedical Documents

The structural layout information of scanned document pages is valuable for a wide range of document processing applications such as automatic document searching, document delivery and automated data entry. This paper describes the classification of scanned document pages into different classes of physical layout structures. The page layout classification technique proposed in this paper uses a...

Adaptive layout for interactive documents

In many application domains there is a strong need to produce content both for traditional print media and for interactive media. In order to fully benefit from digital devices, online documents must provide mechanisms to support interactivity and for the personalization of content. Thus, powerful authoring tools as well as flexible layout techniques are needed to display dynamic information ef...

Eye-tracking Analysis for Automatic Documents Eye-catching Layout Retrieval

In this paper we present a synthesis of experiments of eye movement pursuit that have been applied to documents structure retrieval. The aim of this work is to propose a representation of structured documents content (the physical layout) through the simulation of a possible human inspired scan path. The research project which is presented here is based on the hypotheses that the analysis and t...

IJEL 4/1 page layout

Instructional text, and procedural text in particular, is a genre that users heavily rely upon when they are learning new procedures, devices or systems. It is, however, also well-known to be a genre that is difficult to produce and maintain. This article discusses Isolde, an environment that attempts to address this problem by supporting the semi-automated production of procedural instructions...

IJEL 4/3 page layout

Studies have shown that when learning occurs in an environment that uses animated pedagogical agents and personalized instruction, the learner learns the material more deeply and can recall it easier when compared to learning without an agent. Thus, an effective learning system creates personalized contexts for each learner. The “one size fits all” concept is not very effective across a large n...

Journal

Journal title: Lecture Notes in Computer Science

Year: 2021

ISSN: 1611-3349, 0302-9743

DOI: https://doi.org/10.1007/978-3-030-86331-9_32